Woodlands County
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Smith, Michael J., Geach, James E.
In this review, we explore the historical development and future prospects of artificial intelligence (AI) and deep learning in astronomy. We trace the evolution of connectionism in astronomy through its three waves, from the early use of multilayer perceptrons, to the rise of convolutional and recurrent neural networks, and finally to the current era of unsupervised and generative deep learning methods. With the exponential growth of astronomical data, deep learning techniques offer an unprecedented opportunity to uncover valuable insights and tackle previously intractable problems. As we enter the anticipated fourth wave of astronomical connectionism, we argue for the adoption of GPT-like foundation models fine-tuned for astronomical applications. Such models could harness the wealth of high-quality, multimodal astronomical data to serve state-of-the-art downstream tasks. To keep pace with advancements driven by Big Tech, we propose a collaborative, open-source approach within the astronomy community to develop and maintain these foundation models, fostering a symbiotic relationship between AI and astronomy that capitalizes on the unique strengths of both fields.
PADLoC: LiDAR-Based Deep Loop Closure Detection and Registration Using Panoptic Attention
Arce, Josรฉ, Vรถdisch, Niclas, Cattaneo, Daniele, Burgard, Wolfram, Valada, Abhinav
A key component of graph-based SLAM systems is the ability to detect loop closures in a trajectory to reduce the drift accumulated over time from the odometry. Most LiDAR-based methods achieve this goal by using only the geometric information, disregarding the semantics of the scene. In this work, we introduce PADLoC for joint loop closure detection and registration in LiDAR-based SLAM frameworks. We propose a novel transformer-based head for point cloud matching and registration, and to leverage panoptic information during training time. In particular, we propose a novel loss function that reframes the matching problem as a classification task for the semantic labels and as a graph connectivity assignment for the instance labels. During inference, PADLoC does not require panoptic annotations, making it more versatile than other methods. Additionally, we show that using two shared matching and registration heads with their source and target inputs swapped increases the overall performance by enforcing forward-backward consistency. We perform extensive evaluations of PADLoC on multiple real-world datasets demonstrating that it achieves state-of-the-art results. The code of our work is publicly available at http://padloc.cs.uni-freiburg.de.
Utterance Weighted Multi-Dilation Temporal Convolutional Networks for Monaural Speech Dereverberation
Ravenscroft, William, Goetze, Stefan, Hain, Thomas
Speech dereverberation is an important stage in many speech technology applications. Recent work in this area has been dominated by deep neural network models. Temporal convolutional networks (TCNs) are deep learning models that have been proposed for sequence modelling in the task of dereverberating speech. In this work a weighted multi-dilation depthwise-separable convolution is proposed to replace standard depthwise-separable convolutions in TCN models. This proposed convolution enables the TCN to dynamically focus on more or less local information in its receptive field at each convolutional block in the network. It is shown that this weighted multi-dilation temporal convolutional network (WD-TCN) consistently outperforms the TCN across various model configurations and using the WD-TCN model is a more parameter efficient method to improve the performance of the model than increasing the number of convolutional blocks. The best performance improvement over the baseline TCN is 0.55 dB scale-invariant signal-to-distortion ratio (SISDR) and the best performing WD-TCN model attains 12.26 dB SISDR on the WHAMR dataset.
Deep-seismic-prior-based reconstruction of seismic data using convolutional neural networks
Liu, Qun, Fu, Lihua, Zhang, Meng
Reconstruction of seismic data with missing traces is a long-standing issue in seismic data processing. In recent years, rank reduction operations are being commonly utilized to overcome this problem, which require the rank of seismic data to be a prior. However, the rank of field data is unknown; usually it requires much time to manually adjust the rank and just obtain an approximated rank. Methods based on deep learning require very large datasets for training; however acquiring large datasets is difficult owing to physical or financial constraints in practice. Therefore, in this work, we developed a novel method based on unsupervised learning using the intrinsic properties of a convolutional neural network known as U-net, without training datasets. Only one undersampled seismic data was needed, and the deep seismic prior of input data could be exploited by the network itself, thus making the reconstruction convenient. Furthermore, this method can handle both irregular and regular seismic data. Synthetic and field data were tested to assess the performance of the proposed algorithm (DSPRecon algorithm); the advantages of using our method were evaluated by comparing it with the singular spectrum analysis (SSA) method for irregular data reconstruction and de-aliased Cadzow method for regular data reconstruction. Experimental results showed that our method provided better reconstruction performance than the SSA or Cadzow methods. The recovered signal-to-noise ratios (SNRs) were 32.68 dB and 19.11 dB for the DSPRecon and SSA algorithms, respectively. Those for the DSPRecon and Cadzow methods were 35.91 dB and 15.32 dB, respectively.
Dilated deeply supervised networks for hippocampus segmentation in MRI
Folle, Lukas, Vesal, Sulaiman, Ravikumar, Nishant, Maier, Andreas
Tissue loss in the hippocampi has been heavily correlated with the progression of Alzheimer's Disease (AD). The shape and structure of the hippocampus are important factors in terms of early AD diagnosis and prognosis by clinicians. However, manual segmentation of such subcortical structures in MR studies is a challenging and subjective task. In this paper, we investigate variants of the well known 3D U-Net, a type of convolution neural network (CNN) for semantic segmentation tasks. We propose an alternative form of the 3D U-Net, which uses dilated convolutions and deep supervision to incorporate multi-scale information into the model. The proposed method is evaluated on the task of hippocampus head and body segmentation in an MRI dataset, provided as part of the MICCAI 2018 segmentation decathlon challenge. The experimental results show that our approach outperforms other conventional methods in terms of different segmentation accuracy metrics.